Lecture 5 - Exercise Problems: Solutions

Author

Shunkei Kakimoto

1 Loop

  1. Using the for loop, calculate the sum of the first n numbers for n = 1, 2, ..., 10. Save the results in a vector object.

Part 1

# --- Create an empty vector --- #
output_storage <- rep(0, 10)

# --- for loop --- #
for (i in 1:10){
  output_storage[i] <- sum(1:i)
}
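
As a cross-check, the same running totals can be obtained without a loop. The following is a minimal sketch using base R's cumsum(); the name output_vectorized is introduced here only for illustration and is not part of the original solution.

```r
# Cumulative sums of 1, 2, ..., 10: element i equals sum(1:i)
output_vectorized <- cumsum(1:10)

# Rebuild the loop result so the two approaches can be compared
output_storage <- rep(0, 10)
for (i in 1:10) {
  output_storage[i] <- sum(1:i)
}

# TRUE if both approaches give the same vector
all(output_vectorized == output_storage)
```

The loop version is what the exercise asks for; the vectorized version is mainly useful for verifying the answer.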

2 Combining Datasets using Loop

In the “corn_yield_by_states” folder, you will find corn yield datasets (2000-2022) by state. Using a for loop (or foreach()), load each dataset and combine them into a single dataset.

Hint: You can use the list.files() function, a built-in R function that returns a character vector of the files in the specified folder. The syntax is list.files(path = "path to the folder", full.names = TRUE), where the path argument points to the target folder. Setting full.names = TRUE returns the full file paths instead of just the file names, and you can then use those paths to load the data.
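
To see what full.names changes, here is a small self-contained sketch. It does not use the course data: the folder list_files_demo and the two tiny datasets are hypothetical and are created in the session's temporary directory purely for illustration.

```r
# Hypothetical demo folder inside the session's temporary directory
demo_dir <- file.path(tempdir(), "list_files_demo")
dir.create(demo_dir, showWarnings = FALSE)

# Write two small example datasets as .rds files
saveRDS(data.frame(state = "A", yield = 150), file.path(demo_dir, "a.rds"))
saveRDS(data.frame(state = "B", yield = 160), file.path(demo_dir, "b.rds"))

# File names only (not enough to load the files from another directory)
names_only <- list.files(path = demo_dir)

# Full paths, ready to pass to readRDS()
full_paths <- list.files(path = demo_dir, full.names = TRUE)

names_only
full_paths
```

With full.names = TRUE, each element can be passed directly to readRDS() regardless of the current working directory.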

Keep in mind there are various ways to approach this problem! 

Make sure that you open the R project for this course. If so, the following code should work on your computer.

There are many approaches. In my opinion, the easiest is to use the foreach function.

# --- Get a list of paths to the target files --- #
ls_path_yield <- list.files(path = "Data/corn_yield_by_states", full.names = TRUE)

# --- Create an empty data.frame as a storage --- #
storage_yield_dt <- data.frame()

# --- for loop --- #
for (i in seq_along(ls_path_yield)){
  
  # Load a dataset using the i-th path in ls_path_yield
  temp_dt <- readRDS(ls_path_yield[[i]])
  
  # Combine temp_dt and storage_yield_dt by row using rbind(). 
  # This way, storage_yield_dt is updated in each iteration.
  storage_yield_dt <- rbind(storage_yield_dt, temp_dt)
}

Another similar approach: You can use list() or vector() to store the datasets in the loop.

library(data.table)

# --- Get a list of paths to the target files --- #
ls_path_yield <- list.files(path = "Data/corn_yield_by_states", full.names = TRUE)

# --- Create an empty list as a storage --- #
ls_yield_dt <- list()

# --- for loop --- #
for (i in seq_along(ls_path_yield)){
  
  # Load a dataset using the i-th path in ls_path_yield
  temp_dt <- readRDS(ls_path_yield[[i]])
  
  # Save the loaded dataset in the storage list
  ls_yield_dt[[i]] <- temp_dt
}

# To combine the datasets in the list, use the rbindlist() function from the data.table package.
yield_dt <- rbindlist(ls_yield_dt)

A third approach uses foreach() from the foreach package:

library(foreach)

# --- Get a list of paths to the target files --- #
ls_path_yield <- list.files(path = "Data/corn_yield_by_states", full.names = TRUE)

# --- foreach loop --- #
yield_dt <- foreach(file_path_i = ls_path_yield, .combine = rbind) %do% {
  # Load a dataset
  temp_dt <- readRDS(file_path_i)
  
  return(temp_dt) 
}

The foreach function returns the value of the last expression evaluated in each iteration. Because temp_dt is that last value here, the explicit return(temp_dt) is optional. Once all iterations are completed, the per-iteration datasets are combined by row (because I specified .combine = rbind).
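
For comparison, the same load-and-combine pattern can be sketched in base R with lapply() and do.call(rbind, ...), with no extra packages. This is only an illustration under stated assumptions: the folder combine_demo and its two small datasets are hypothetical stand-ins for the course data, created in a temporary directory so the example runs anywhere.

```r
# Hypothetical demo folder with two small per-state datasets
demo_dir <- file.path(tempdir(), "combine_demo")
dir.create(demo_dir, showWarnings = FALSE)
saveRDS(data.frame(state = "A", year = 2000:2001, yield = c(150, 155)),
        file.path(demo_dir, "a.rds"))
saveRDS(data.frame(state = "B", year = 2000:2001, yield = c(160, 158)),
        file.path(demo_dir, "b.rds"))

# Full paths to every file in the folder
paths <- list.files(path = demo_dir, full.names = TRUE)

# Load each file into one element of a list
ls_dt <- lapply(paths, readRDS)

# Stack all list elements by row into a single data.frame
combined <- do.call(rbind, ls_dt)

combined
```

Like the list-based loop above, this collects all datasets first and binds them once, rather than growing a data.frame inside the loop.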

3 Functions (and Loops)

  1. Write a function to calculate the area of a circle with a given radius. The function should return the area of the circle.

  2. The factorial of a non-negative integer \(n\), written \(n!\), is the product of all positive integers between 1 and \(n\) (e.g., \(5! = 5 \times 4 \times 3 \times 2 \times 1 = 120\)). Create a function named calculate_factorial that takes a single integer argument n. The function should return the value of \(n!\). For simplicity, assume that n is not zero.

  3. Write a function that calculates the sum of the first n numbers for n = 1, 2, ..., 10. The function should return the results as a vector object.

  4. Write a function to count how many odd numbers there are in a provided vector containing a series of integers.

Part 1

# --- Create a function --- #
circle_area <- function(radius){
  return(pi * radius^2)
}

# --- Create a function --- #
calculate_factorial <- function(n){
  return(prod(1:n))
}

# --- Test --- #
# Returns 0, not 1: 1:0 is c(1, 0), so this version fails for n = 0.
calculate_factorial(0)


# --- A bit more sophisticated way --- #
calculate_factorial <- function(n) {
  if (n %in% c(0, 1)) {
    return(1)
  } else {
    return(prod(1:n))
  }
}
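
One way to check the function is to compare it with base R's built-in factorial(). The sketch below repeats the corrected definition so it runs on its own; the names test_values, own_result, and base_result are introduced here only for illustration.

```r
# Same definition as above, repeated so this block is self-contained
calculate_factorial <- function(n) {
  if (n %in% c(0, 1)) {
    return(1)
  } else {
    return(prod(1:n))
  }
}

# Apply the function to several inputs, including the edge case n = 0
test_values <- 0:6
own_result <- sapply(test_values, calculate_factorial)

# Base R's factorial() is vectorized, so it takes the whole vector at once
base_result <- factorial(test_values)

# TRUE if the two results agree
all.equal(own_result, base_result)
```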

Part 2

# --- Create a function --- #
sum_n <- function(n){
  output_storage <- rep(0, n)
  
  for (i in 1:n){
    output_storage[i] <- sum(1:i)
  }
  
  return(output_storage)
}

Part 3

# --- Using loop --- #
num.odd <- function(v){
  count <- 0
  
  for (i in seq_along(v)){
    if (v[i] %% 2 != 0){
      count <- count + 1
    }
  }
  
  return(count)
}
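
The loop can also be replaced by vectorized arithmetic. The following is a minimal sketch of that alternative; the function name num_odd_vectorized is hypothetical and is not part of the original solution.

```r
# --- Without a loop --- #
num_odd_vectorized <- function(v) {
  # v %% 2 != 0 is TRUE for odd values; sum() counts the TRUEs
  sum(v %% 2 != 0)
}

# Four of these values are odd: 1, 3, 5, 7
num_odd_vectorized(c(1, 2, 3, 4, 5, 7))

# An empty vector contains no odd numbers
num_odd_vectorized(integer(0))
```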

3.1 Functions

You’re a data expert at a store chain. The company needs to study its monthly sales growth to plan better. They expect sales to grow by a fixed percentage each month. Your job is to create an R function that shows sales growth over a year.

For sales growth, use the following formula

\[S_t = S_0 \times (1 + g)^{t-1}\]

where \(S_t\) is the sales in month \(t\), \(S_0\) is the starting sales, and \(g\) is the growth rate.
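
As a quick check of the formula, take the illustrative values \(S_0 = 50\) and \(g = 0.03\) (assumed here for the example; they match the test call at the end of this section). The first three months are then:

\[S_1 = 50 \times 1.03^{0} = 50, \quad S_2 = 50 \times 1.03^{1} = 51.5, \quad S_3 = 50 \times 1.03^{2} = 53.045\]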

Create a function called monthly_sales_growth with the following three inputs:

  • initial_sales: Starting sales (in thousands of dollars).
  • growth_rate: Monthly growth rate (as a decimal, like 0.03 for 3% growth).
  • months: How many months to predict (usually 12 for a year).

The function should return a numeric vector of expected sales. Even better, return a data.frame or data.table with two columns, e.g., month and sales, showing the expected sales for each month.

Part 1

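# Note: discount_rate is an optional extra that is not part of the problem statement.
# With its default of 0, the result matches the formula above.
monthly_sales_growth <- function(initial_sales, growth_rate, months, discount_rate = 0){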
  sales <- rep(0, months)
  
  for (i in 1:months){
    sales[i] <- initial_sales * (1 + growth_rate)^(i-1) * (1 - discount_rate)
  }
  
  return(sales)
}

If you prefer not to use a for loop, you can use vectorized arithmetic instead:

monthly_sales_growth <- function(initial_sales, growth_rate, months) {
  # Generate a sequence of months
  month_seq <- 1:months
  
  # Calculate the sales growth for each month
  sales <- initial_sales * (1 + growth_rate) ^ (month_seq - 1)

  return(sales)
}

# --- Test --- #
monthly_sales_growth(initial_sales = 50, growth_rate = 0.03, months = 12)
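
The problem statement also suggests returning a table with month and sales columns. A minimal sketch of that variant follows; the function name monthly_sales_growth_df is hypothetical and is used here only to keep it distinct from the vector version above.

```r
monthly_sales_growth_df <- function(initial_sales, growth_rate, months) {
  # Generate a sequence of months
  month_seq <- seq_len(months)

  # Expected sales for each month: S_t = S_0 * (1 + g)^(t - 1)
  sales <- initial_sales * (1 + growth_rate)^(month_seq - 1)

  # Return one row per month
  data.frame(month = month_seq, sales = sales)
}

# --- Test --- #
sales_df <- monthly_sales_growth_df(initial_sales = 50, growth_rate = 0.03, months = 12)
head(sales_df, 3)
```

Returning a data.frame keeps each sales value next to its month, which makes the result easier to plot or merge with other data.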